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Abstract  —  Matrix  factorization  is  applied  to  unsupervised  linear 
unmixing  for  hyperspectral  imagery.  The  algorithm,  called  non¬ 
negative  matrix  factorization,  is  used.  It  imposes  a  constraint  on 
the  non-negativity  of  the  amplitudes  of  the  recovered  endmember 
spectral  signatures  as  well  as  their  fractional  abundances.  This 
ensures  the  recovery  of  physically  meaningful  endmembers  and 
their  abundances.  This  algorithm  is  further  modified  to  include 
the  sum-to-one  constraint  such  that  the  sum  of  the  fractional 
abundances  for  each  pixel  is  unity.  Several  practical 
implementation  issues  in  hyperspectral  image  unmixng  are 
discussed.  Some  preliminary  results  from  AVIRIS  experiments 
are  presented.  We  also  discuss  the  advantages  and  possible 
limitations  of  this  method  in  hyperspectral  image  analysis. 
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I.  Introduction 

Linear  unmixing  analysis  is  a  well-known  technique  in 
remote  sensing  image  analysis.  It  is  based  on  the  fact  that  the 
rough  spatial  resolution  permits  different  materials  present  in 
the  area  covered  by  a  single  pixel.  The  linear  mixture  model 
says  that  a  pixel  reflectance  in  a  visible-near  infrared 
multispectral  or  hyperspectral  image  is  the  linear  mixture  from 
all  independent  pure  materials  (endmembers)  present  in  an 
image  scene  [1]. 

Let  L  be  the  number  of  spectral  bands  and  r  an  Zxl 
column  pixel  vector  in  a  multispectral  or  hyperspectral  image. 
Assume  that  there  are  P  objects/materials  (i.e.,  endmembers) 
present  in  an  image  scene,  which  construct  an  LxP  signature 
matrix  M  =  [itijirij  where  m;  represents  the  j- th 

endmember.  Assume  that  a  =  (axa2  ■  ■  ■  cc P)T  is  a  px  1 
abundance  vector  associated  with  r,  where  a j  denotes  the 
abundance  fraction  of  the  m(  in  r.  In  the  linear  mixture 

model,  r  is  considered  as  the  linear  mixture  of  n^,  m2,  ...,  mP 
as 

r  =  Ma  +  n  (1) 

where  n  is  included  to  account  for  either  measurement  or 
model  error  [1],  If  M  is  assumed  to  be  known  and  keeps  to  be 


the  same  for  all  the  pixels,  then  the  problem  is  to  estimate  a 
which  is  changed  with  pixel. 

A  typical  method  to  estimate  a  is  the  least  squares 
approach.  The  estimate  from  the  least  squares  solution  is  the 
one  that  minimizes  the  estimation  residual 

min(r-Ma)r(r-Ma) .  (2) 

a 

In  order  for  the  estimated  abundance  vector  a  to  faithfully 
represent  an  image  pixel  vector  r,  two  constraints  are  generally 
imposed  on  a  in  Eq.  (1):  (a)  abundance  sum-to-one  constraint, 

p 

referred  to  as  ASC,  ^orp=l;  and  (b)  abundance  non- 

P=\ 

negativity  constraint,  referred  to  as  ANC,  a p  >  0  for  all 
1  <  p  <  P .  There  is  no  closed-form  solution  to  such  a 
constrained  linear  unmixing  problem.  So  an  iterative  method 
generally  is  used. 

In  real  applications,  endmember  signatures  in  M  may  be 
unknown  and  difficult  to  obtain.  Then  the  task  is  much  more 
challenging  since  both  M  and  a  need  to  be  estimated. 
Intuitively,  the  M  matrix  should  be  nonnegative.  Otherwise, 
the  solution  may  not  be  physically  realistic. 

II.  NON-NEGATIVE  MATRIX  FACTORIZATION 

The  linear  unmixing  problem  in  Eq.  (1)  can  also  be  solved 
by  matrix  factorization.  Recently,  non-negative  matrix 
factorization  (NMF)  is  developed  [2-7].  Given  a  non-negative 
data  matrix  X  of  size  LxIJ,  NMF  can  find  an  approximate 
factorization  X  =  AS  into  non-negative  factors  A  of  size  LxP 
and  S  of  size  PxIJ.  The  non-negativity  constraint  imposed  on 
A  and  S  makes  them  purely  additive,  in  contrast  to  other  factor 
analysis  techniques  such  as  principal  component  analysis 
(PCA)  and  independent  component  analysis  (ICA).  Let  IJ  be 
the  number  of  pixels  in  an  image  of  size  IxJ,  and  let 

X  =  {r,  }2f ,  be  the  data  matrix  including  all  the  pixel  vectors. 
Then  the  linear  mixture  model  in  Eq.  (1)  can  be  represented  as 

X  =  AS  +  N  (2) 
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where  A  corresponds  to  the  endmember  matrix  M,  S  =  {a,-}M 
is  the  abundance  matrix  with  the  z-th  column  vector 
representing  the  abundances  for  the  z'-th  pixel,  and  N  =  {n}^=1 . 
Eq.  (2)  will  be  used  thereafter. 

The  formulation  of  the  NMF  algorithm  of  Lee  and  Seung 
in  [4]  is  reviewed  as  follows.  Assume  noise  is  Gaussian 
distributed.  The  maximum  likelihood  estimation  of  A  and  S 
are 


A,S  =  argmax  p(X  |  A,S) 

A,S 

=  argmin(-  log p(X  |  A,S))  (3) 

A,S 

=  argmin||X- AS||“ 

A,S 

subject  to  A  >  0 ,  S  >  0 


Define  F  =  ||X-  AS||“  .  The  gradients  with  respect  to  A  and  S 
are  given  by 


dF 

dAi,P 

dF 

dSpJ 


=-4™rL-hsT)J 

=  -2«Arx);,,-(ArAs)p,)  (4) 


where  l  =  p  =  1, and  i  =  are  indexes  for 

A  and  S.  Then  the  adaptation  functions  can  be  constructed  as 


Ai,P  AUp  +  SLP  lxSr  \,P  -  [aSST)i'P  ) 

Spj  <-  SpJ  +  UPJ  ((Arx)P;1.  -  (Ar As)p, )  (5) 


where  S;  p  and  1J p  are  learning  rates.  In  [4]  it  is  shown  that 
by  choosing  them  as  8l  p  =  Al  p  /(aSS  t  );  and 

ijp  i  =  Sp  i  /[at  As)  i  ,  the  adaptation  functions  become 


(xsrl 


M 


SpJ 


(6) 


which  is  a  multiplicative  update  rule. 

The  update  rule  in  Eq.  (6)  is  proved  to  converge  to  at  least 
a  local  optimal  maximum  likelihood  solution  [5].  We  can  see 
that  the  advantage  of  using  such  a  multiplicative  update  rule  in 
Eq.  (6)  over  the  additive  rule  in  Eq.  (5)  is  the  guarantee  of 
non-negativity  of  A  and  S,  provided  that  they  are  initiated 
non-negative  and  the  data  matrix  X  is  non-negative.  Also,  it  is 
unnecessary  to  choose  any  learning  rate. 


III.  CONSTRAINED  MATRIX  FACTORIZATION  FOR 
HYPERSPECTRAL  IMAGE  LINEAR  UNMIXING 

The  NMF  algorithm  has  been  applied  to  some  real 
applications  [8-10].  Here  we  discuss  several  practical 
implementation  issues  when  it  is  applied  to  linear  unmixing  of 
hyperspectral  imagery. 

Sum-to-one  Constraint 

The  NMF  needs  to  be  modified  such  that  the  sum-to-one 
constraint  can  be  satisfied.  This  can  be  achieved  by  adding  one 
more  row  of  the  data  matrix  X  and  A  as  1,  i.e., 

S  +  N  =  AS  +  N  (6) 

where  1  and  1  are  column  vectors  of  size  IJx  1  and  Pxl, 
respectively,  with  all  the  elements  equal  to  1 .  Then  the  last  row 

of  A  will  not  participate  the  adaptation. 

Estimation  of  the  Number  of Factors 

The  number  of  factors  P  ought  to  be  used  is  unknown, 
which  is  the  number  of  endmembers  present  in  an  image  scene. 
If  the  value  of  P  is  changed,  then  the  final  results  of  A  and  S 
are  obviously  different.  This  is  a  common  problem  for  any 
factor  analysis  based  technique. 

The  hypothesis  testing  based  eigen-thresholding  method  in 
[11]  can  be  applied  to  estimate  the  number  of  distinctive 
signals  in  an  image  scene,  and  this  number  is  used  as  P. 

Algorithm  Initiation 

The  algorithm  may  be  sensitive  to  the  initial  condition. 
There  are  different  ways  to  initialize  the  LxP  non-negative  A 
matrix:  1)  A  is  randomly  created  within  the  dynamic  range  of 
the  image;  2)  P  pixel  vectors  are  uniformly  selected  from  the 
image  scene  as  an  initial  A;  3)  Perform  vector  quantization 
(VQ)  first  and  the  P  clusters  are  used  to  construct  the  A.  In  our 
research,  we  use  the  first  method  to  randomly  initialize  the  A. 
After  A  is  initiated,  S  is  initiated  using  the  estimate  from  the 
fully  constrained  least  squares  linear  unmixing  method  in  [12]. 

Data  Pre-processing 

All  the  water  absorption  bands  and  low  signal-to-noise  ratio 
bands  are  removed  before  linear  unmixing.  Due  to  sensor 
noise,  some  pixel  elements  may  have  negative  values,  which 
are  replaced  with  0. 

To  summarize,  the  steps  for  estimating  the  endmember 
matrix  A  and  abundance  matrix  S  are  listed  as  follows. 

1 )  Construct  the  data  matrix  X  by  removing  bad  bands,  bad 
pixel  points,  and  adding  the  last  row  as  1 . 

2)  Estimate  the  number  of  factors  P. 

3)  Initiate  A  as  an  LxP  matrix  and  adding  the  last  row  as  1, 
and  initiate  S  by  finding  the  FCLSLU  estimate. 


4)  Update  A  and  S  using  Eq.  (6)  with  the  last  row  of  A 
unchanged. 

IV.  Experiment 

The  AVIRIS  Lunar  Lake  image  data  was  used  in  computer 
simulation.  The  original  200x200  subimage  was  shown  in 
Figure  1.  According  to  the  prior  information,  there  are  five 
materials  in  this  scene.  So  in  this  preliminary  experiment  we 
set  P  equal  to  5.  Then  five  endmember  signatures  were 
randomly  initialized.  After  about  1000  iterations,  the  five 
fractional  abundance  images  were  generated  as  shown  in 
Figure  2.  The  algorithm  actually  converged  very  fast,  because 
after  about  50-100  iterations  there  were  no  obviously 
difference  in  the  classified  images  although  the  estimation 

error  F  =  X  —  AS  continued  to  gradually  decrease. 


& 

A 

Figure  1 .  A  band  subimage  of  AVIRIS  Lunar  Lake  scene. 
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Figure  2.  Classification  results  using  constrained  matrix  factorization. 


Compared  to  the  unsupervised  least  squares  result  [12],  we 
can  see  that  ‘‘Vegetation”,  “Shade”  and  “Cinder”  were 
correctly  classified,  but  “Rhyolite”  and  “Playa  lake”  were  not 
well  separated. 

VII.  Discussion 

The  preliminary  experiment  shows  that  the  constrained 
matrix  factorization  algorithm  may  be  feasible  to  linear- 
unmixing  based  hyperspectral  image  classification,  but  its 
performance  needs  to  be  further  improved.  The  major  problem 
is  that  the  algorithm  is  sensitive  to  the  initial  conditions.  How 
to  find  an  appropriate  A  initial  matrix  should  be  investigated. 

The  algorithm  in  this  paper  basically  is  maximum 
likelihood  estimation  with  the  assumption  that  noise  is 
Gaussian  distributed.  But  we  know  that  noise  may  not  be  well 
modeled  by  Gaussian  distribution.  So  a  better  noise  model  is 
required  to  improve  the  overall  performance  of  this  algorithm 
for  hyperspectral  image  analysis. 
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